
4 research outputs found

Context-aware Deep Model for Entity Recommendation in Search Engine at Alibaba

Authors: Hua Nengwei, Jia Qianghuai, Zhang Ningyu
Publication date: 06/09/2019

Entity recommendation, which improves the search experience by helping users find entities related to a given query, has become an indispensable feature of today's search engines. Existing studies typically consider only queries with explicit entities. They usually fail to handle complex queries without entities, such as "what food is good for cold weather", because their models cannot infer the underlying meaning of the input text. In this work, we believe that contexts convey valuable evidence that can facilitate the semantic modeling of queries, and we take them into consideration for entity recommendation. To better model the semantics of queries and entities, we learn their representations jointly with attentive deep neural networks. We evaluate our approach using large-scale, real-world search logs from a widely used commercial Chinese search engine. Our system has been deployed in the ShenMa Search Engine and is available in Alibaba's UC Browser. Results from an online A/B test suggest that the impression efficiency of click-through rate increased by 5.1% and page views increased by 5.5%.
Comment: CIKM2019 International Workshop on Entity Retrieval. arXiv admin note: text overlap with arXiv:1511.08996 by other author

arXiv.org e-Print Archive

Conceptualized Representation Learning for Chinese Biomedical Text Mining

Authors: Dong Liang, Gao Feng, Hua Nengwei, Jia Qianghuai, Yin Kangping, Zhang Ningyu
Publication date: 25/08/2020

Biomedical text mining is becoming increasingly important as the number of biomedical documents and the volume of web data grow rapidly. Recently, word representation models such as BERT have gained popularity among researchers. However, it is difficult to estimate their performance on datasets containing biomedical texts, as the word distributions of general and biomedical corpora are quite different. Moreover, the medical domain has long-tail concepts and terminologies that are difficult to learn via language models. Chinese biomedical text is even more difficult because of its complex structure and the variety of phrase combinations. In this paper, we investigate how the recently introduced pre-trained language model BERT can be adapted for Chinese biomedical corpora and propose a novel conceptualized representation learning approach. We also release a new Chinese Biomedical Language Understanding Evaluation benchmark (ChineseBLUE). We examine the effectiveness of Chinese pre-trained models: BERT, BERT-wwm, RoBERTa, and our approach. Experimental results on the benchmark show that our approach could bring significant gain. We release the pre-trained model on GitHub: https://github.com/alibaba-research/ChineseBLUE
Comment: WSDM2020 Health Da

arXiv.org e-Print Archive

A Concept Knowledge Graph for User Next Intent Prediction at Alipay

Authors: He Yacheng, Jia Qianghuai, Li Ruopeng, Ou Yixin, Yuan Lin, Zhang Ningyu
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 14/03/2023

This paper illustrates the technologies of user next intent prediction with a concept knowledge graph. The system has been deployed on the Web at Alipay, serving more than 100 million daily active users. To explicitly characterize user intent, we propose AlipayKG, an offline concept knowledge graph in the Life-Service domain that models the historical behaviors of users, the rich content users interact with, and the relations between them. We further introduce a Transformer-based model that integrates expert rules from the knowledge graph to infer the online user's next intent. Experimental results demonstrate that the proposed system can effectively enhance the performance of downstream tasks while retaining explainability.
Comment: Accepted by WWW 2023 poste

arXiv.org e-Print Archive

AliCG: Fine-grained and Evolvable Conceptual Graph Construction for Semantic Search at Alibaba

Authors: Chen Huajun, Chen Hui, Chen Xiang, Deng Shumin, Hua Nengwei, Huang Gang, Jia Qianghuai, Tou Huaixiao, Wang Zhao, Ye Hongbin, Zhang Ningyu
Publication date: 07/12/2021

Conceptual graphs, which are a particular type of knowledge graph, play an essential role in semantic search. Prior conceptual graph construction approaches typically extract high-frequency, coarse-grained, and time-invariant concepts from formal texts. In real applications, however, it is necessary to extract less frequent, fine-grained, and time-varying conceptual knowledge and to build the taxonomy in an evolving manner. In this paper, we introduce an approach to implementing and deploying the conceptual graph at Alibaba. Specifically, we propose a framework called AliCG, which is capable of a) extracting fine-grained concepts with a novel bootstrapping with alignment consensus approach, b) mining long-tail concepts with a novel low-resource phrase mining approach, and c) updating the graph dynamically via a concept distribution estimation method based on implicit and explicit user behaviors. We have deployed the framework at Alibaba UC Browser. Extensive offline evaluation and online A/B testing demonstrate the efficacy of our approach.
Comment: Accepted by KDD 2021 (Applied Data Science Track

arXiv.org e-Print Archive